pip install geopandas
Requirement already satisfied: geopandas in c:\users\swati\anaconda3\lib\site-packages (0.14.4) Requirement already satisfied: fiona>=1.8.21 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (1.9.6) Requirement already satisfied: numpy>=1.22 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (1.26.4) Requirement already satisfied: packaging in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (23.1) Requirement already satisfied: pandas>=1.4.0 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (2.1.4) Requirement already satisfied: pyproj>=3.3.0 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (3.6.1) Requirement already satisfied: shapely>=1.8.0 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (2.0.4) Requirement already satisfied: attrs>=19.2.0 in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (23.1.0) Requirement already satisfied: certifi in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (2024.2.2) Requirement already satisfied: click~=8.0 in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (8.1.7) Requirement already satisfied: click-plugins>=1.0 in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (1.1.1) Requirement already satisfied: cligj>=0.5 in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (0.7.2) Requirement already satisfied: six in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (1.16.0) Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\swati\anaconda3\lib\site-packages (from pandas>=1.4.0->geopandas) (2.8.2) Requirement already satisfied: pytz>=2020.1 in c:\users\swati\anaconda3\lib\site-packages (from pandas>=1.4.0->geopandas) (2023.3.post1) Requirement already satisfied: tzdata>=2022.1 in c:\users\swati\anaconda3\lib\site-packages (from pandas>=1.4.0->geopandas) (2023.3) Requirement already satisfied: colorama in c:\users\swati\anaconda3\lib\site-packages (from click~=8.0->fiona>=1.8.21->geopandas) (0.4.6) Note: you may need to restart the kernel to use updated packages.
pip install geopy
Requirement already satisfied: geopy in c:\users\swati\anaconda3\lib\site-packages (2.4.1) Requirement already satisfied: geographiclib<3,>=1.52 in c:\users\swati\anaconda3\lib\site-packages (from geopy) (2.0) Note: you may need to restart the kernel to use updated packages.
pip install folium
Requirement already satisfied: folium in c:\users\swati\anaconda3\lib\site-packages (0.16.0) Requirement already satisfied: branca>=0.6.0 in c:\users\swati\anaconda3\lib\site-packages (from folium) (0.7.2) Requirement already satisfied: jinja2>=2.9 in c:\users\swati\anaconda3\lib\site-packages (from folium) (3.1.3) Requirement already satisfied: numpy in c:\users\swati\anaconda3\lib\site-packages (from folium) (1.26.4) Requirement already satisfied: requests in c:\users\swati\anaconda3\lib\site-packages (from folium) (2.31.0) Requirement already satisfied: xyzservices in c:\users\swati\anaconda3\lib\site-packages (from folium) (2022.9.0) Requirement already satisfied: MarkupSafe>=2.0 in c:\users\swati\anaconda3\lib\site-packages (from jinja2>=2.9->folium) (2.1.3) Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\swati\anaconda3\lib\site-packages (from requests->folium) (2.0.4) Requirement already satisfied: idna<4,>=2.5 in c:\users\swati\anaconda3\lib\site-packages (from requests->folium) (3.4) Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\swati\anaconda3\lib\site-packages (from requests->folium) (1.26.18) Requirement already satisfied: certifi>=2017.4.17 in c:\users\swati\anaconda3\lib\site-packages (from requests->folium) (2024.2.2) Note: you may need to restart the kernel to use updated packages.
import pandas as pd
import geopandas as gpd
import folium
from folium import Choropleth, Circle, Marker
from folium.plugins import HeatMap, MarkerCluster
from geopy.geocoders import Nominatim
Project 1: Analyzing Impact of Toxic gas release in Pennsylvania¶
The following dataset from the US Environmental Protection Agency (EPA) tracks the release of toxic chemical in Philadelphia, Pennsylvania, USA.
releases = gpd.read_file(r"C:\Users\swati\Desktop\DATA ANALYSIS\Introduction to Geospatial analysis\toxic_release_pennsylvania\toxic_release_pennsylvania\toxic_release_pennsylvania.shp")
releases.head()
| YEAR | CITY | COUNTY | ST | LATITUDE | LONGITUDE | CHEMICAL | UNIT_OF_ME | TOTAL_RELE | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2016 | PHILADELPHIA | PHILADELPHIA | PA | 40.005901 | -75.072103 | FORMIC ACID | Pounds | 0.160 | POINT (2718560.227 256380.179) |
| 1 | 2016 | PHILADELPHIA | PHILADELPHIA | PA | 39.920120 | -75.146410 | ETHYLENE GLYCOL | Pounds | 13353.480 | POINT (2698674.606 224522.905) |
| 2 | 2016 | PHILADELPHIA | PHILADELPHIA | PA | 40.023880 | -75.220450 | CERTAIN GLYCOL ETHERS | Pounds | 104.135 | POINT (2676833.394 261701.856) |
| 3 | 2016 | PHILADELPHIA | PHILADELPHIA | PA | 39.913540 | -75.198890 | LEAD COMPOUNDS | Pounds | 1730.280 | POINT (2684030.004 221697.388) |
| 4 | 2016 | PHILADELPHIA | PHILADELPHIA | PA | 39.913540 | -75.198890 | BENZENE | Pounds | 39863.290 | POINT (2684030.004 221697.388) |
The below dataset consists readings from air quality stations across the city.
stations = gpd.read_file(r"C:\Users\swati\Desktop\DATA ANALYSIS\Introduction to Geospatial analysis\PhillyHealth_Air_Monitoring_Stations\PhillyHealth_Air_Monitoring_Stations\PhillyHealth_Air_Monitoring_Stations.shp")
stations.head()
| SITE_NAME | ADDRESS | BLACK_CARB | ULTRAFINE_ | CO | SO2 | OZONE | NO2 | NOY_NO | PM10 | ... | PAMS_VOC | TSP_11101 | TSP_METALS | TSP_LEAD | TOXICS_TO1 | MET | COMMUNITY_ | LATITUDE | LONGITUDE | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | LAB | 1501 East Lycoming Avenue | N | N | Y | N | Y | Y | Y | N | ... | Y | N | Y | N | y | N | N | 40.008606 | -75.097624 | POINT (2711384.641 257149.310) |
| 1 | ROX | Eva and Dearnley Streets | N | N | N | N | N | N | N | N | ... | N | N | Y | N | Y | N | N | 40.050461 | -75.236966 | POINT (2671934.290 271248.900) |
| 2 | NEA | Grant Avenue and Ashton Street | N | N | N | N | Y | N | N | N | ... | N | N | N | N | N | Y | N | 40.072073 | -75.013128 | POINT (2734326.638 280980.247) |
| 3 | CHS | 500 South Broad Street | N | N | N | N | N | N | N | N | ... | N | N | Y | N | Y | N | N | 39.944510 | -75.165442 | POINT (2693078.580 233247.101) |
| 4 | NEW | 2861 Lewis Street | N | N | Y | Y | Y | N | Y | Y | ... | N | Y | N | Y | N | Y | N | 39.991688 | -75.080378 | POINT (2716399.773 251134.976) |
5 rows × 24 columns
Measuring distance¶
Measuring the distance between the plant that releases the toxic gas and the air quality monitoring stations.
print(stations.crs)
print(releases.crs)
EPSG:2272 EPSG:2272
# Selecting one release incident for analysis
recent_release = releases.iloc[360]
recent_release
YEAR 2012 CITY PHILADELPHIA COUNTY PHILADELPHIA ST PA LATITUDE 39.91354 LONGITUDE -75.19889 CHEMICAL SULFURIC ACID (1994 AND AFTER ACID AEROSOLS" O... UNIT_OF_ME Pounds TOTAL_RELE 365600.0 geometry POINT (2684030.0040213093 221697.3882902659) Name: 360, dtype: object
# Measure distance from release to each station
distances = stations.geometry.distance(recent_release.geometry)
distances
0 44778.509761 1 51006.456589 2 77744.509207 3 14672.170878 4 43753.554393 5 4711.658655 6 23197.430858 7 12072.823097 8 79081.825506 9 3780.623591 10 27577.474903 11 19818.381002 dtype: float64
The mean distance of the release to air quality monitoring station:
print('Mean distance to monitoring stations: {} feet'.format(distances.mean()))
Mean distance to monitoring stations: 33516.28487007786 feet
The closest monitoring station:
print('Closest monitoring station ({} feet):'.format(distances.min()))
print(stations.iloc[distances.idxmin()][["ADDRESS", "LATITUDE", "LONGITUDE"]])
Closest monitoring station (3780.623590556444 feet): ADDRESS 3100 Penrose Ferry Road LATITUDE 39.91279 LONGITUDE -75.185448 Name: 9, dtype: object
Creating a buffer¶
Creating a buffer to understand how many toxic gas releasing plants are within 2 miles of any air quality monitoring station.
two_mile_buffer = stations.geometry.buffer(2*5280)
two_mile_buffer.head()
0 POLYGON ((2721944.641 257149.310, 2721893.792 ... 1 POLYGON ((2682494.290 271248.900, 2682443.441 ... 2 POLYGON ((2744886.638 280980.247, 2744835.789 ... 3 POLYGON ((2703638.580 233247.101, 2703587.731 ... 4 POLYGON ((2726959.773 251134.976, 2726908.924 ... dtype: geometry
# Create map with release incidents and monitoring stations
m = folium.Map(location=[39.9526,-75.1652], zoom_start=11)
HeatMap(data=releases[['LATITUDE', 'LONGITUDE']], radius=15).add_to(m)
for idx, row in stations.iterrows():
Marker([row['LATITUDE'], row['LONGITUDE']]).add_to(m)
m
# Plot each polygon on the map
folium.GeoJson(two_mile_buffer.to_crs(epsg=4326)).add_to(m)
m
Identifying whether release occured within 2 miles of air quality monitoring station:
# Turn group of polygons into single multipolygon
my_union = two_mile_buffer.geometry.unary_union
print('Type:', type(my_union))
# Show the MultiPolygon object
my_union
Type: <class 'shapely.geometry.multipolygon.MultiPolygon'>
The release considered in this study does exist within 2 miles of a station.
my_union.contains(releases.iloc[360].geometry)
True
But not all releases occur within 2 miles:
# The closest station is more than two miles away
my_union.contains(releases.iloc[358].geometry)
False
Project 2: Hospital Proximity Analysis¶
Analysing how many hospitals are available to respond to crash collision accidents in New York City.
collisions = gpd.read_file(r"C:\Users\swati\Desktop\DATA ANALYSIS\Introduction to Geospatial analysis\NYPD_Motor_Vehicle_Collisions\NYPD_Motor_Vehicle_Collisions\NYPD_Motor_Vehicle_Collisions.shp")
collisions.head()
| DATE | TIME | BOROUGH | ZIP CODE | LATITUDE | LONGITUDE | LOCATION | ON STREET | CROSS STRE | OFF STREET | ... | CONTRIBU_2 | CONTRIBU_3 | CONTRIBU_4 | UNIQUE KEY | VEHICLE TY | VEHICLE _1 | VEHICLE _2 | VEHICLE _3 | VEHICLE _4 | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 07/30/2019 | 0:00 | BRONX | 10464 | 40.841100 | -73.784960 | (40.8411, -73.78496) | None | None | 121 PILOT STREET | ... | Unspecified | None | None | 4180045 | Sedan | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | None | None | POINT (1043750.211 245785.815) |
| 1 | 07/30/2019 | 0:10 | QUEENS | 11423 | 40.710827 | -73.770660 | (40.710827, -73.77066) | JAMAICA AVENUE | 188 STREET | None | ... | None | None | None | 4180007 | Sedan | Sedan | None | None | None | POINT (1047831.185 198333.171) |
| 2 | 07/30/2019 | 0:25 | None | None | 40.880318 | -73.841286 | (40.880318, -73.841286) | BOSTON ROAD | None | None | ... | None | None | None | 4179575 | Sedan | Station Wagon/Sport Utility Vehicle | None | None | None | POINT (1028139.293 260041.178) |
| 3 | 07/30/2019 | 0:35 | MANHATTAN | 10036 | 40.756744 | -73.984590 | (40.756744, -73.98459) | None | None | 155 WEST 44 STREET | ... | None | None | None | 4179544 | Box Truck | Station Wagon/Sport Utility Vehicle | None | None | None | POINT (988519.261 214979.320) |
| 4 | 07/30/2019 | 10:00 | BROOKLYN | 11223 | 40.600090 | -73.965910 | (40.60009, -73.96591) | AVENUE T | OCEAN PARKWAY | None | ... | None | None | None | 4180660 | Station Wagon/Sport Utility Vehicle | Bike | None | None | None | POINT (993716.669 157907.212) |
5 rows × 30 columns
Collision hotspots of New York city:
m_1 = folium.Map(location=[40.7, -74], zoom_start=11)
HeatMap(data=collisions[['LATITUDE','LONGITUDE']],radius=10).add_to(m_1)
m_1
Understanding Hospital Coverage¶
The below dataset consists of all the hospitals in New York City:
hospitals = gpd.read_file(r"C:\Users\swati\Desktop\DATA ANALYSIS\Introduction to Geospatial analysis\nyu_2451_34494\nyu_2451_34494\nyu_2451_34494.shp")
hospitals.head()
| id | name | address | zip | factype | facname | capacity | capname | bcode | xcoord | ycoord | latitude | longitude | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 317000001H1178 | BRONX-LEBANON HOSPITAL CENTER - CONCOURSE DIVI... | 1650 Grand Concourse | 10457 | 3102 | Hospital | 415 | Beds | 36005 | 1008872.0 | 246596.0 | 40.843490 | -73.911010 | POINT (1008872.000 246596.000) |
| 1 | 317000001H1164 | BRONX-LEBANON HOSPITAL CENTER - FULTON DIVISION | 1276 Fulton Ave | 10456 | 3102 | Hospital | 164 | Beds | 36005 | 1011044.0 | 242204.0 | 40.831429 | -73.903178 | POINT (1011044.000 242204.000) |
| 2 | 317000011H1175 | CALVARY HOSPITAL INC | 1740-70 Eastchester Rd | 10461 | 3102 | Hospital | 225 | Beds | 36005 | 1027505.0 | 248287.0 | 40.848060 | -73.843656 | POINT (1027505.000 248287.000) |
| 3 | 317000002H1165 | JACOBI MEDICAL CENTER | 1400 Pelham Pkwy | 10461 | 3102 | Hospital | 457 | Beds | 36005 | 1027042.0 | 251065.0 | 40.855687 | -73.845311 | POINT (1027042.000 251065.000) |
| 4 | 317000008H1172 | LINCOLN MEDICAL & MENTAL HEALTH CENTER | 234 E 149 St | 10451 | 3102 | Hospital | 362 | Beds | 36005 | 1005154.0 | 236853.0 | 40.816758 | -73.924478 | POINT (1005154.000 236853.000) |
Percentage of collisions more than 10km away from closest hospital¶
m_2 = folium.Map(location=[40.7, -74], zoom_start=11)
# Your code here: Visualize the hospital locations
for idx,row in hospitals.iterrows():
Marker([row['latitude'],row['longitude']],popup=row['name']).add_to(m_1)
m_1
coverage = gpd.GeoDataFrame(geometry=hospitals.geometry).buffer(10000)
my_union = coverage.geometry.unary_union
outside_range = collisions.loc[~collisions["geometry"].apply(lambda x: my_union.contains(x))]
percentage = round(100*len(outside_range)/len(collisions), 2)
print("Percentage of collisions more than 10 km away from the closest hospital: {}%".format(percentage))
Percentage of collisions more than 10 km away from the closest hospital: 15.12%
Identifying the best potential hospital choice for collisions that occur in distant location:
def best_hospital(collision_location):
idx_min = hospitals.geometry.distance(collision_location).idxmin()
my_hospital = hospitals.iloc[idx_min]
name = my_hospital["name"]
return name
print(best_hospital(outside_range.geometry.iloc[0]))
CALVARY HOSPITAL INC
Identifying hospital that have the highest demand of distant collision cases:
highest_demand = outside_range.geometry.apply(best_hospital).value_counts().idxmax()
highest_demand
'JAMAICA HOSPITAL MEDICAL CENTER'